A modular tool to aggregate results from bioinformatics analyses across many samples into a single report.
Report generated on 2016-12-21, 17:12 based on data in:
/Users/philewels/GitHub/MultiQC_website/public_html/examples/wgs/data
General Statistics
Showing 6/6 rows and 13/22 columns.

| Sample Name | TiTV ratio (novel) | TiTV ratio (known) | Change rate | Ts/Tv | M Variants | Avg. GC | Insert Size | ≥ 30X | Coverage | % Aligned | % Dups | % GC | M Seqs |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| P4107_1001 | 1.5 | 2.1 | 764 | 1.995 | 4.06 | 41% | 358 | 74.7% | 36.0 | 97.3% | 6.4% | 41% | 383.6 |
| P4107_1002 | 1.5 | 2.1 | 762 | 1.994 | 4.07 | 41% | 367 | 82.3% | 40.0 | 97.8% | 9.9% | 41% | 430.2 |
| P4107_1003 | 1.5 | 2.1 | 761 | 1.994 | 4.07 | 41% | 365 | 82.4% | 40.0 | 97.6% | 10.5% | 41% | 431.4 |
| P4107_1004 | 1.5 | 2.1 | 765 | 1.996 | 4.05 | 41% | 363 | 84.7% | 46.0 | 98.2% | 39.4% | 40% | 498.2 |
| P4107_1005 | 1.5 | 2.1 | 762 | 1.994 | 4.07 | 41% | 368 | 85.3% | 45.0 | 98.0% | 24.5% | 41% | 484.2 |
| P4107_1006 | 1.5 | 2.1 | 761 | 1.993 | 4.07 | 41% | 362 | 84.1% | 43.0 | 98.1% | 12.4% | 41% | 453.2 |
GATK
GATK is a toolkit offering a wide variety of tools with a primary focus on variant discovery and genotyping.
Variant Counts
Compare Overlap
Showing 6/6 rows and 5/5 columns.

| Sample Name | Compare rate | Concordant rate | M Evaluated variants | M Known sites | M Novel sites |
|---|---|---|---|---|---|
| P4107_1001 | 45.44% | 99.09% | 3.5 | 3.4 | 1.9 |
| P4107_1002 | 45.75% | 99.09% | 3.5 | 3.4 | 1.9 |
| P4107_1003 | 44.76% | 99.08% | 3.5 | 3.5 | 2.0 |
| P4107_1004 | 45.84% | 99.09% | 3.5 | 3.4 | 1.9 |
| P4107_1005 | 45.82% | 99.10% | 3.5 | 3.4 | 1.9 |
| P4107_1006 | 45.91% | 99.10% | 3.5 | 3.4 | 1.9 |
SnpEff
SnpEff is a genetic variant annotation and effect prediction toolbox. It annotates and predicts the effects of variants on genes (such as amino acid changes).
Variants by Genomic Region
Variant Effects by Impact
Variant Effects by Class
Variant Qualities
QualiMap
QualiMap is a platform-independent application to facilitate the quality control of alignment sequencing data and its derivatives like feature counts.
Coverage histogram
Cumulative coverage genome fraction
Insert size histogram
GC content distribution
The dotted line represents a pre-calculated GC distribution for the reference genome.
Picard
Picard is a set of Java command line tools for manipulating high-throughput sequencing data.
Mark Duplicates
FastQ Screen
FastQ Screen allows you to screen a library of sequences in FastQ format against a set of sequence databases so you can see if the composition of the library matches with what you expect.
FastQC
FastQC is a quality control tool for high throughput sequence data, written by Simon Andrews at the Babraham Institute in Cambridge.
Sequence Quality Histograms
The mean quality value across each base position in the read. See the FastQC help.
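As a rough illustration of what this plot summarises, the sketch below averages Phred scores at each read position. It assumes Phred+33 encoded quality strings (the usual encoding for recent Illumina data); the function name `mean_quality_per_position` is hypothetical, and this is not FastQC's own implementation.

```python
def mean_quality_per_position(quality_strings, offset=33):
    """Return the mean Phred score at each read position.

    `offset=33` assumes Phred+33 encoding; reads may differ in length.
    """
    totals = []  # sum of scores seen at each position
    counts = []  # number of reads covering each position
    for qual in quality_strings:
        for i, ch in enumerate(qual):
            if i == len(totals):
                totals.append(0)
                counts.append(0)
            totals[i] += ord(ch) - offset
            counts[i] += 1
    return [t / c for t, c in zip(totals, counts)]


# Toy quality strings: 'I' decodes to Q40 and '5' to Q20 under Phred+33.
example_quals = ["IIII", "II5I", "I55"]
means = mean_quality_per_position(example_quals)
print(means)
```

A drop in the mean toward the end of the read is the typical pattern this plot is used to spot.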
Per Sequence Quality Scores
The number of reads with average quality scores. Shows if a subset of reads has poor quality. See the FastQC help.
Per Base Sequence Content
The proportion of each base position for which each of the four normal DNA bases has been called. See the FastQC help.
Click a heatmap row to see a line plot for that dataset.
Per Sequence GC Content
The average GC content of reads. Normal random libraries typically have a roughly normal distribution of GC content. See the FastQC help.
The dashed black line shows theoretical GC content: Human Genome (UCSC hg38).
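The per-read GC distribution described above can be sketched as follows. This is a minimal example under stated assumptions: uncalled bases (N) are excluded from the denominator, percentages are rounded to whole numbers for binning, and the helper names `gc_percent` and `gc_histogram` are hypothetical rather than part of FastQC or MultiQC.

```python
from collections import Counter


def gc_percent(seq):
    """Percentage of G and C among called bases (A, C, G, T) in one read."""
    seq = seq.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        return 0.0  # assumption: a read with no called bases counts as 0% GC
    return 100.0 * (seq.count("G") + seq.count("C")) / called


def gc_histogram(reads):
    """Count reads per whole-percent GC bin."""
    return Counter(round(gc_percent(read)) for read in reads)


# Toy reads, not taken from the report data.
example_reads = ["ACGT", "GGCC", "AATT", "GCNA"]
hist = gc_histogram(example_reads)
print(sorted(hist.items()))
```

Plotting the bin counts against GC percentage gives the kind of curve that is compared with the theoretical distribution.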
Per Base N Content
The percentage of base calls at each position for which an N was called. See the FastQC help.
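The metric above amounts to a per-position tally of N calls. The sketch below shows one way to compute it; the function name `n_percent_per_position` is hypothetical and the code is an illustration, not the FastQC implementation.

```python
def n_percent_per_position(reads):
    """Percentage of reads with an N call at each position."""
    n_counts = []  # N calls seen at each position
    totals = []    # reads covering each position
    for read in reads:
        for i, base in enumerate(read.upper()):
            if i == len(totals):
                n_counts.append(0)
                totals.append(0)
            totals[i] += 1
            if base == "N":
                n_counts[i] += 1
    return [100.0 * n / t for n, t in zip(n_counts, totals)]


# Toy reads, not taken from the report data.
n_example_reads = ["ACGN", "NCGT", "ACGT", "ACG"]
n_percent = n_percent_per_position(n_example_reads)
print(n_percent)
```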
Sequence Length Distribution
All samples have sequences of a single length (151bp).
Sequence Duplication Levels
The relative level of duplication found for every sequence. See the FastQC help.
Overrepresented sequences
The total amount of overrepresented sequences found in each library. See the FastQC help for further information.
Adapter Content
The cumulative percentage count of the proportion of your library which has seen each of the adapter sequences at each position. See the FastQC help. Only samples with ≥ 0.1% adapter contamination are shown.